perf(autoware_tensorrt_plugins): keep SegmentCSR allocation-free #12555
Merged
mojomex merged 2 commits on Jun 15, 2026
Tested the following:
source /opt/ros/humble.bash
colcon build --symlink-install --mixin rel-with-deb-info compile-commands --packages-up-to autoware_ptv3
colcon test --packages-select autoware_tensorrt_plugins --event-handlers console_cohesion+
colcon test-result --verbose
Then launched ptv3 with a custom rosbag that already contains a concatenated pointcloud, and visually confirmed correctness (recording: segmentcsr.webm).
manato reviewed on Jun 3, 2026
Initialize the SegmentCSR output buffer directly instead of allocating, filling, copying, and freeing a scratch base buffer on every launch. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Max SCHMELLER <max.schmeller@tier4.jp>
tier4-autoware-public-bot (bot) pushed a commit to tier4/autoware_universe_perception that referenced this pull request on Jun 15, 2026
…owarefoundation/autoware_universe#12555) Initialize the SegmentCSR output buffer directly instead of allocating, filling, copying, and freeing a scratch base buffer on every launch. Signed-off-by: Max SCHMELLER <max.schmeller@tier4.jp> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Fourth PR in the SegmentCSR split stack.
After the test, contract-cleanup, and n-dimensional `indptr` removal commits, this final commit keeps `segment_csr_launch` allocation-free by filling the output buffer directly instead of allocating, filling, copying, and freeing a scratch base buffer on every launch.
Stack
`indptr` path.
Benchmark context
This keeps the original #12555 optimization isolated. The prior measurements showed an isolated SegmentCSR-path improvement of about 2.7% on top of #12554 for PTv3-T18, with standalone kernel microbenchmarks showing the per-call allocation removal saving roughly 14 microseconds.
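The difference between the two initialization strategies described in the summary can be illustrated with a small CPU-side sketch. This is not the plugin's CUDA code: the function names, the choice of a max reduction, and the `kInit` sentinel are all assumptions made for illustration. It only shows why writing the initial value straight into the output gives the same result as staging it through a scratch buffer.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical initial value for a max reduction (assumption, not from the PR).
constexpr float kInit = std::numeric_limits<float>::lowest();

// Segment s covers src[indptr[s]] .. src[indptr[s + 1] - 1];
// indptr has num_segments + 1 entries and out has num_segments entries.

// Pattern being removed: allocate a scratch "base" buffer, fill it,
// copy it into the output, then free it, on every call.
void segment_max_with_scratch(
  const float * src, const std::int64_t * indptr, std::size_t num_segments, float * out)
{
  std::vector<float> base(num_segments, kInit);  // allocate and fill
  std::copy(base.begin(), base.end(), out);      // copy into the output
  for (std::size_t s = 0; s < num_segments; ++s) {
    for (std::int64_t i = indptr[s]; i < indptr[s + 1]; ++i) {
      out[s] = std::max(out[s], src[i]);
    }
  }
}  // base is freed here

// Allocation-free pattern: start each segment from the initial value
// and write the result directly into the caller-provided output.
void segment_max_direct(
  const float * src, const std::int64_t * indptr, std::size_t num_segments, float * out)
{
  for (std::size_t s = 0; s < num_segments; ++s) {
    float acc = kInit;
    for (std::int64_t i = indptr[s]; i < indptr[s + 1]; ++i) {
      acc = std::max(acc, src[i]);
    }
    out[s] = acc;  // empty segments keep the initial value
  }
}
```

Both functions produce identical outputs, including for empty segments. On a GPU the scratch variant would additionally pay for a device allocation and a device-to-device copy per launch, which is the cost the PR description attributes to the removed path.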